Method for multi-level distributed speech recognition



3. What was the problem(s) to be solved by the invention or what was the need(s) for the
invention:

Speech recognition in handheld devices is very limited because of processing and information
storage limitations (e.g. the PIM on a phone is small). On the other hand, centralized,
network-based servers can recognize a much larger vocabulary (e.g. a large PIM, dynamic
real-time information), but this is slow because of network constraints and higher processing
needs. The invention enables fast, cost-effective, and comprehensive speech recognition through
a multi-level, distributed approach.

4. What is the prior art, and why doesn't it resolve the problem(s) or fulfill the need(s)?
Speech recognition on a centralized server or on a device is well known in the art. However,
multi-level speech recognition involving recognition first on a device and then on a server
is considered to be new.
5. What is the invention being disclosed:

This invention provides a method for multi-level, distributed speech recognition, where the
device performs speech recognition first and, if unable to do so successfully or completely,
refers speech recognition to a remote server. This way, the client device can perform
recognition whenever it can and the remote server is utilized only when necessary.

6. How does this invention resolve the problem(s) and fulfill the need(s) in a new way: Attach any
drawings or diagrams that are necessary for clarification.

This invention enables speech recognition to be performed at multiple levels and enables a
flexible, quick, cost-efficient system for performing recognition on a device, a remote
server, or a combination of the two. It also enables a device (e.g. a phone) to access
different services based on recognition (e.g. weather vs. nearest restaurant).
Systems that use speech recognition, such as phone devices and
servers such as MIX/Myosphere.



N/A 



Date of conception:

Product(s) this invention may be used in:

Date the first offer for sale was made for a product incorporating this invention:
Approvals: 1) Technical Staff or Patent Liaison 2) Management (both required) Signing form
14. What is the business impact of having a patent on this invention, for Motorola and/or
competition:

This invention will help Motorola preserve and extend its advantage in speech-enabled
services by enabling fast, efficient, cost-effective speech recognition.



15. Expanded description; list any additional details you feel would be helpful in describing
the invention:
(See attached) 



16. Additional details concerning the prior art related to this invention:



Attach any backup documents or provide any other information you feel would be helpful in
determining the desirability of obtaining a patent on this invention. Any attachments that are 
critical to the disclosure of the invention should be witnessed. 
Additional information:

This invention provides a method for multi-level, distributed speech recognition, where the
device performs speech recognition first and, if unable to do so successfully or completely,
refers speech recognition to a remote server. This way, the client device can perform recognition
whenever it can and the remote server is utilized only when necessary.

The first aspect of the invention is to defer recognition to a server in the event that the
device cannot recognize the utterance (e.g. the grammar doesn't contain words in the utterance).
An example would be when a user selects the weather service on a phone (which is running on the
device) and utters "Stockholm". The phone's speech recognizer will try to recognize
it. If it cannot recognize it because the phone's limited grammar does not contain
"Stockholm", the phone can forward the request to a remote server. This server will perform
recognition on "Stockholm" and provide the recognized utterance back to the phone. Now the
phone's weather service can present the user with the weather in Stockholm (or the remote server
can provide the weather for Stockholm instead).
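
The device-first, server-second behaviour described in this first aspect can be sketched as below. This is only an illustrative sketch: utterances are modelled as text rather than audio, the remote call is simulated by a local function, and the grammar contents and all function names are assumptions, not part of the disclosure.

```python
from typing import Optional, Tuple

# Hypothetical grammars: the device knows a few cities, the server many more.
DEVICE_GRAMMAR = {"chicago", "boston"}
SERVER_GRAMMAR = {"chicago", "boston", "stockholm", "helsinki"}


def recognize(utterance: str, grammar: set) -> Optional[str]:
    """Stand-in recognizer: succeeds only if the word is in the grammar."""
    word = utterance.strip().lower()
    return word if word in grammar else None


def recognize_with_fallback(utterance: str) -> Tuple[Optional[str], str]:
    """Try the device first; defer to the server only when the device fails."""
    result = recognize(utterance, DEVICE_GRAMMAR)
    if result is not None:
        return result, "device"
    # In a real system this would be a network request to the remote server.
    result = recognize(utterance, SERVER_GRAMMAR)
    if result is not None:
        return result, "server"
    return None, "unrecognized"


print(recognize_with_fallback("Boston"))
print(recognize_with_fallback("Stockholm"))
```

In this sketch the server is contacted only for "Stockholm", which matches the stated goal of using the remote server only when necessary.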

The second aspect of the invention is the ability to defer recognition to a server if the
device does not have sufficient processing power (CPU) or electrical power. Conversely, when the
device is connected to a laptop or to a telematics system in a vehicle (which provides higher
processing capability) and/or a power connection, the device can perform speech recognition to a
greater extent than before.
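
One way to picture the resource check in this second aspect is the sketch below. The status fields and the threshold values are hypothetical; the disclosure does not specify how power or CPU availability is measured.

```python
from dataclasses import dataclass


@dataclass
class DeviceStatus:
    on_external_power: bool  # e.g. docked in a vehicle or plugged in
    battery_percent: int     # 0-100
    cpu_load_percent: int    # 0-100


# Illustrative thresholds only; the disclosure does not give any values.
MIN_BATTERY_PERCENT = 30
MAX_CPU_LOAD_PERCENT = 70


def should_recognize_locally(status: DeviceStatus) -> bool:
    """Return True if the device has enough power and CPU headroom."""
    has_power = (status.on_external_power
                 or status.battery_percent >= MIN_BATTERY_PERCENT)
    has_cpu = status.cpu_load_percent <= MAX_CPU_LOAD_PERCENT
    return has_power and has_cpu


docked = DeviceStatus(on_external_power=True, battery_percent=10, cpu_load_percent=20)
on_battery = DeviceStatus(on_external_power=False, battery_percent=10, cpu_load_percent=20)
print(should_recognize_locally(docked))      # recognize on the device
print(should_recognize_locally(on_battery))  # defer to the remote server
```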

The third aspect of the invention is the ability to understand keywords and forward the
utterance to an appropriate recognizer. For example, when a user utters "weather in
Stockholm" and the device recognizes "weather" (it's in the grammar) but not "Stockholm"
(it's not in the grammar), it knows to forward the request to the appropriate server (i.e.
weather, and not the nearest ATM service) along with any context information based on the
recognized word.
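
The keyword-based routing of this third aspect might look roughly like the following sketch. The keyword table, the server names, and the shape of the returned request are all hypothetical, and the utterance is again modelled as already-segmented words rather than audio.

```python
from typing import Dict, List

# Hypothetical mapping from keywords in the device grammar to remote servers.
DEVICE_KEYWORDS = {
    "weather": "weather-server",
    "atm": "atm-locator-server",
}


def route_utterance(words: List[str]) -> Dict[str, object]:
    """Pick a server from the first recognized keyword and attach context."""
    for i, word in enumerate(words):
        server = DEVICE_KEYWORDS.get(word.lower())
        if server is not None:
            return {
                "server": server,
                "context": {"service": word.lower()},
                # Everything except the keyword is left for the server.
                "unrecognized": words[:i] + words[i + 1:],
            }
    # No keyword recognized: send everything to a general-purpose server.
    return {"server": "default-server", "context": {}, "unrecognized": list(words)}


print(route_utterance(["weather", "in", "Stockholm"]))
```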

Advantages/benefits:

1. The device can attempt to recognize an utterance, and if it cannot, it can forward the 
utterance to a more powerful, more feature-rich remote server for recognition. This makes the 
system more flexible. 

2. Speech recognition can be done in parallel on the local device (phone) as well as on the remote
server (MIX). This can improve the probability of correct recognition (two is better than one).

3. The local device can recognize parts of an utterance and use them to direct the request to the
appropriate server (e.g. the local device recognizes "weather" in "weather at Stockholm" and uses
that to direct the request to a remote server such as a European weather service).

4. If the device doesn't have enough power and/or CPU (e.g. a laptop running on battery) to support
full-featured recognition, it can forward the utterance to a remote server for recognition
and other power-intensive tasks. If the device has sufficient power and CPU (e.g. a laptop with
a power connection), it can perform full-featured recognition by itself.

5. The local device can recognize part of an utterance and process that information, forward
unrecognized parts to a remote server for processing, and combine the processed information later
for efficient service. This can potentially reduce cost (airtime, etc.) and processing time.
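
As an illustration of the parallel recognition mentioned in advantage 2, the sketch below runs two stub recognizers concurrently and keeps the hypothesis with the higher confidence. The recognizers, their outputs, and the "pick the highest confidence" rule are assumptions made for illustration; the disclosure does not say how the two results are combined.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple

Hypothesis = Tuple[str, float]  # (recognized text, confidence in [0, 1])


def device_recognizer(audio: bytes) -> Hypothesis:
    """Stub for the on-device recognizer with a small grammar."""
    return ("stock home", 0.40)


def server_recognizer(audio: bytes) -> Hypothesis:
    """Stub for the remote recognizer (would be a network call)."""
    return ("stockholm", 0.92)


def recognize_in_parallel(audio: bytes) -> Hypothesis:
    """Run both recognizers at once and keep the more confident result."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(r, audio)
                   for r in (device_recognizer, server_recognizer)]
        results = [f.result() for f in futures]
    return max(results, key=lambda hyp: hyp[1])


print(recognize_in_parallel(b"placeholder audio"))
```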

Example 1:




1. User calls MIX via voice and utters "weather at Stockholm" to request the current
   weather conditions in Stockholm, Sweden.

2. Phone/device performs speech recognition and recognizes sufficient information to identify
   the service request (weather), but the grammar doesn't include Stockholm, so it cannot
   recognize the latter part ("Stockholm").

3. Device passes the request to the appropriate remote server (in this case, weather). This could
   be a separate server for weather only, or it can be a common server to which context-relevant
   information is also passed (in this case the context is that the user is
   looking for weather information). The context can be used for such purposes as loading the
   correct grammar (the weather grammar consisting of city names instead of a grammar of
   store locations).

4. Remote server performs speech recognition on the request ("weather at Stockholm") or the
   latter portion of the request ("Stockholm") and provides weather conditions in
   Stockholm, Sweden to the user.

5. Alternatively, the remote server can pass back the recognized word to the weather
   service on the phone.
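
The context-driven grammar loading in Example 1 can be sketched from the server's side as follows. The grammar contents, the context format, and the suffix-matching strategy are assumptions for illustration only, and text again stands in for audio.

```python
from typing import Dict, Optional

# Hypothetical server-side grammars, selected by the context the device sends.
GRAMMARS = {
    "weather": {"stockholm", "chicago", "paris"},
    "stores": {"main street", "airport mall"},
}


def server_recognize(utterance: str, context: Dict[str, str]) -> Optional[str]:
    """Load the grammar named by the context, then match the whole request
    or a trailing portion of it against that grammar."""
    grammar = GRAMMARS.get(context.get("service", ""), set())
    words = utterance.lower().split()
    for start in range(len(words)):
        candidate = " ".join(words[start:])
        if candidate in grammar:
            return candidate
    return None


print(server_recognize("weather at Stockholm", {"service": "weather"}))
print(server_recognize("weather at Stockholm", {"service": "stores"}))
```

With the weather context the city-name grammar is loaded and "stockholm" is found; with the store-location grammar the same request is not recognized, which is why passing the context along matters.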



Example 2: 
[Flowchart; box and branch labels as they appear in the figure]

START

User utters "Weather"; phone/device recognizes utterance; invokes weather application

Prompt user for city; user utters "Stockholm"

Decision: Recognized/confidence
   Branch labels: "Maybe in grammar, but didn't recognize / medium confidence"; "N"

Forward utterance to remote server

Decision: Is remote server able to recognize utterance?

Forward recognized utterance to device; device presents weather
   or
directly present weather to user

Present information to user

END
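
The flow of Example 2 can be approximated by the sketch below. The figure's branch wiring is not fully legible, so the sketch assumes that only a high-confidence local result is presented directly and that everything else (medium confidence or no recognition) is forwarded to the remote server. The threshold value, function names, and messages are hypothetical.

```python
from typing import Callable, Optional, Tuple

HIGH_CONFIDENCE = 0.8  # assumed threshold; the disclosure gives no value


def handle_city_utterance(
    local_result: Tuple[Optional[str], float],
    remote_recognize: Callable[[], Optional[str]],
) -> str:
    """Present weather locally on a confident match, else ask the server."""
    word, confidence = local_result
    if word is not None and confidence >= HIGH_CONFIDENCE:
        return f"Weather for {word} (recognized on device)"
    # Medium confidence or no recognition: forward the utterance.
    remote_word = remote_recognize()
    if remote_word is not None:
        return f"Weather for {remote_word} (recognized on server)"
    return "City not recognized"


print(handle_city_utterance(("Chicago", 0.95), lambda: None))
print(handle_city_utterance(("Stockton", 0.55), lambda: "Stockholm"))
print(handle_city_utterance((None, 0.0), lambda: None))
```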